AITopics | style reward

Collaborating Authors

style reward

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Manipulate as Human: Learning Task-oriented Manipulation Skills by Adversarial Motion Priors

Ma, Ziqi, Tian, Changda, Gao, Yue

arXiv.org Artificial IntelligenceOct-29-2025

In recent years, there has been growing interest in developing robots and autonomous systems that can interact with human in a more natural and intuitive way. One of the key challenges in achieving this goal is to enable these systems to manipulate objects and tools in a manner that is similar to that of humans. In this paper, we propose a novel approach for learning human-style manipulation skills by using adversarial motion priors, which we name HMAMP. The approach leverages adversarial networks to model the complex dynamics of tool and object manipulation, as well as the aim of the manipulation task. The discriminator is trained using a combination of real-world data and simulation data executed by the agent, which is designed to train a policy that generates realistic motion trajectories that match the statistical properties of human motion. We evaluated HMAMP on one challenging manipulation task: hammering, and the results indicate that HMAMP is capable of learning human-style manipulation skills that outperform current baseline methods. Additionally, we demonstrate that HMAMP has potential for real-world applications by performing real robot arm hammering tasks. In general, HMAMP represents a significant step towards developing robots and autonomous systems that can interact with humans in a more natural and intuitive way, by learning to manipulate tools and objects in a manner similar to how humans do.

artificial intelligence, machine learning, manipulation task, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1017/S0263574725001444

2510.24257

Country:

Asia (0.29)
Europe > Switzerland (0.28)

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

Guiding Energy-Efficient Locomotion through Impact Mitigation Rewards

Wang, Chenghao, Viswanathan, Arjun, Sihite, Eric, Ramezani, Alireza

arXiv.org Artificial IntelligenceOct-14-2025

Animals achieve energy-efficient locomotion by their implicit passive dynamics, a marvel that has captivated roboticists for decades.Recently, methods incorporated Adversarial Motion Prior (AMP) and Reinforcement learning (RL) shows promising progress to replicate Animals' naturalistic motion. However, such imitation learning approaches predominantly capture explicit kinematic patterns, so-called gaits, while overlooking the implicit passive dynamics. This work bridges this gap by incorporating a reward term guided by Impact Mitigation Factor (IMF), a physics-informed metric that quantifies a robot's ability to passively mitigate impacts. By integrating IMF with AMP, our approach enables RL policies to learn both explicit motion trajectories from animal reference motion and the implicit passive dynamic. We demonstrate energy efficiency improvements of up to 32%, as measured by the Cost of Transport (CoT), across both AMP and handcrafted reward structure.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2510.09543

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Add feedback

Constrained Style Learning from Imperfect Demonstrations under Task Optimality

Wen, Kehan, Li, Chenhao, He, Junzhe, Hutter, Marco

arXiv.org Artificial IntelligenceSep-24-2025

Learning from demonstration has proven effective in robotics for acquiring natural behaviors, such as stylistic motions and lifelike agility, particularly when explicitly defining style-oriented reward functions is challenging. Synthesizing stylistic motions for real-world tasks usually requires balancing task performance and imitation quality. Existing methods generally depend on expert demonstrations closely aligned with task objectives. However, practical demonstrations are often incomplete or unrealistic, causing current methods to boost style at the expense of task performance. To address this issue, we propose formulating the problem as a constrained Markov Decision Process (CMDP). Specifically, we optimize a style-imitation objective with constraints to maintain near-optimal task performance. We introduce an adaptively adjustable Lagrangian multiplier to guide the agent to imitate demonstrations selectively, capturing stylistic nuances without compromising task performance. We validate our approach across multiple robotic platforms and tasks, demonstrating both robust task performance and high-fidelity style learning. On ANYmal-D hardware we show a 14.5% drop in mechanical energy and a more agile gait pattern, showcasing real-world benefits.

artificial intelligence, machine learning, task performance, (16 more...)

arXiv.org Artificial Intelligence

2507.09371

Country: Asia (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

AcL: Action Learner for Fault-Tolerant Quadruped Locomotion Control

Xu, Tianyu, Cheng, Yaoyu, Shen, Pinxi, Zhao, Lin

arXiv.org Artificial IntelligenceMar-28-2025

-- Quadrupedal robots can learn versatile locomotion skills but remain vulnerable when one or more joints lose power . In contrast, dogs and cats can adopt limping gaits when injured, demonstrating their remarkable ability to adapt to physical conditions. Inspired by such adaptability, this paper presents Action Learner (AcL), a novel teacher-student reinforcement learning framework that enables quadrupeds to autonomously adapt their gait for stable walking under multiple joint faults. Unlike conventional teacher-student approaches that enforce strict imitation, AcL leverages teacher policies to generate style rewards, guiding the student policy without requiring precise replication. We train multiple teacher policies, each corresponding to a different fault condition, and subsequently distill them into a single student policy with an encoder-decoder architecture. While prior works primarily address single-joint faults, AcL enables quadrupeds to walk with up to four faulty joints across one or two legs, autonomously switching between different limping gaits when faults occur . Quadruped robots are gaining popularity as versatile mobile platforms capable of navigating diverse terrains and performing robust locomotion tasks such as search and rescue operations in buildings, cargo delivery in cities, and planetary exploration. In such scenarios, quadrupeds may encounter faults that cannot be immediately repaired, requiring them to continue their tasks despite the malfunction.

artificial intelligence, machine learning, teacher policy, (18 more...)

arXiv.org Artificial Intelligence

2503.21401

Country:

Asia > Singapore > Central Region > Singapore (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.89)

Add feedback

Latent Action Priors From a Single Gait Cycle Demonstration for Online Imitation Learning

Hausdörfer, Oliver, von Rohr, Alexander, Lefort, Éric, Schoellig, Angela

arXiv.org Artificial IntelligenceOct-4-2024

Deep Reinforcement Learning (DRL) in simulation often results in brittle and unrealistic learning outcomes. To push the agent towards more desirable solutions, prior information can be injected in the learning process through, for instance, reward shaping, expert data, or motion primitives. We propose an additional inductive bias for robot learning: latent actions learned from expert demonstration as priors in the action space. We show that these action priors can be learned from only a single open-loop gait cycle using a simple autoencoder. Using these latent action priors combined with established style rewards for imitation in DRL achieves above expert demonstration level of performance and leads to more desirable gaits. Further, action priors substantially improve the performance on transfer tasks, even leading to gait transitions for higher target speeds. Videos and code are available at https://sites.google.com/view/latent-action-priors.

demonstration, expert demonstration, style reward, (15 more...)

arXiv.org Artificial Intelligence

2410.03246

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Europe > Belgium > Flanders (0.04)

Genre:

Instructional Material > Online (0.40)
Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

Add feedback

From Words to Wheels: Automated Style-Customized Policy Generation for Autonomous Driving

Han, Xu, Chen, Xianda, Cai, Zhenghan, Cai, Pinlong, Zhu, Meixin, Chu, Xiaowen

arXiv.org Artificial IntelligenceSep-18-2024

Autonomous driving technology has witnessed rapid advancements, with foundation models improving interactivity and user experiences. However, current autonomous vehicles (AVs) face significant limitations in delivering command-based driving styles. Most existing methods either rely on predefined driving styles that require expert input or use data-driven techniques like Inverse Reinforcement Learning to extract styles from driving data. These approaches, though effective in some cases, face challenges: difficulty obtaining specific driving data for style matching (e.g., in Robotaxis), inability to align driving style metrics with user preferences, and limitations to pre-existing styles, restricting customization and generalization to new commands. This paper introduces Words2Wheels, a framework that automatically generates customized driving policies based on natural language user commands. Words2Wheels employs a Style-Customized Reward Function to generate a Style-Customized Driving Policy without relying on prior driving data. By leveraging large language models and a Driving Style Database, the framework efficiently retrieves, adapts, and generalizes driving styles. A Statistical Evaluation module ensures alignment with user preferences. Experimental results demonstrate that Words2Wheels outperforms existing methods in accuracy, generalization, and adaptability, offering a novel solution for customized AV driving behavior. Code and demo available at https://yokhon.github.io/Words2Wheels/.

autonomous driving, user command, words2wheel, (13 more...)

arXiv.org Artificial Intelligence

2409.11694

Country:

Asia > China > Guangdong Province > Guangzhou (0.05)
Asia > China > Tianjin Province > Tianjin (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(3 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback